# Unified Visual Representation
Florence 2 Large
MIT
Florence-2 is an advanced vision foundation model developed by Microsoft, using a prompt-based approach to handle a wide range of vision and vision-language tasks.
Image-to-Text
Transformers

Binaryy
24
0
Florence 2 Large Ft Fix
MIT
Florence-2 is an advanced vision foundation model developed by Microsoft, employing a prompt-based approach to handle a wide range of visual and vision-language tasks.
Image-to-Text
Transformers

AdithyaSK
23
0
Florence 2 Large
MIT
Florence-2 is an advanced vision foundation model developed by Microsoft, employing a prompt-based approach to handle a wide range of visual and vision-language tasks.
Image-to-Text
Transformers

lodestone-horizon
14
0
Florence 2 Base
MIT
Florence-2 is an advanced vision foundation model developed by Microsoft, employing a prompt-based approach to handle a wide range of vision and vision-language tasks.
Image-to-Text
Transformers

microsoft
316.74k
264
Chat UniVi 7B V1.5
Chat-UniVi is a large language model with a unified visual representation, capable of understanding both image and video content.
Image-to-Text
Transformers

Chat-UniVi
649
2
Chat UniVi 13B
Chat-UniVi is a large language model with a unified visual representation, capable of understanding both image and video content.
Image-to-Text
Transformers

Chat-UniVi
57
9